The evolution of cooperation in the unidirectional linear division of labour of finite roles

Evolution of cooperation is a puzzle in evolutionary biology and social sciences. Previous studies assumed that players are equal and have symmetric relationships. In our society, players are in different roles, have an asymmetric relationship and cooperate together. We focused on the linear division of labour in a unidirectional chain that has finite roles, each of which is assigned to one group with cooperators and defectors. A cooperator in an upstream group produces and modifies a product, paying a cost of cooperation, and hands it to a player in a downstream group who obtains the benefit from the product. If players in all roles cooperate, a final product can be completed. However, if a player in a group chooses defection, the division of labour stops, the final product cannot be completed and all players in all roles suffer damage. By using the replicator equations of the asymmetric game, we investigate which sanction system promotes the evolution of cooperation in the division of labour. We find that not the benefit of the product but the cost of cooperation matters to the evolutionary dynamics and that the probability of finding a defector determines which sanction system promotes the evolution of cooperation.


Introduction
Organisms including humans, plants and bacteria cooperate [1][2][3]. Especially, human society is built on large-scale cooperation [4,5]. In the modern world, humans can often work with any other human from around the globe, regardless of country, ethnicity, language and genetics. However, the evolution of cooperation in human society has remained somewhat puzzling.
The Prisoner's Dilemma [6] (hereafter, PD game) can describe why individuals fail to cooperate; if both are cooperators, their payoff is R; if both are defectors, their payoff is P. The payoff of a defector playing the PD game with a cooperator (T) is higher than that of the cooperator (S), and the order of the payoffs is T > R > P > S. [...] means of 'the evolution of cooperation'. As the payoffs of different players in different groups are asymmetric, this system can be modelled by the replicator equations for asymmetric games. They made concrete assumptions to fit the real industrial dumping system, and used the replicator equations for asymmetric games to investigate whether either of two existing sanction systems, namely the producer responsibility system and the actor responsibility system, can promote the evolution of cooperation. In the former system, if defection happens in the linear chain, the player in the first group gets punished by the supervision, whoever defects. In the latter system, if defection happens, the defector is detected and gets punished by the supervision. It was shown that the sanction systems can promote cooperation, and that the producer responsibility system promotes it more than the actor responsibility system, especially when it is almost impossible to monitor and detect defectors. Hereafter, the former sanction system is called the first role sanction system; the latter is the defector sanction system.
In this study, we generalize the three-group model to any countable number of groups arranged in a line. We assume that players in one group play one role and that a player in an upstream group interacts with a player in a downstream group. We also generalize the model so that it can be applied to systems other than the industrial dumping system, and we make simple assumptions. We investigate whether the sanction systems can promote the evolution of cooperation or not.
There are some evolutionary game theoretical studies that seem similar to our framework. The effect of network structures on creating cooperation in asymmetric social interactions has been studied (e.g. [25,54]). These studies might seem to include ours, but they do not. In these studies, each player occupies one vertex, and a player imitates the strategy of a neighbour who is a partner in the game if the neighbour's payoff is higher than that of the player. For example, Su et al. [25] concentrated on how edge orientation, directionality, regularity and properties of graphs change the evolution of cooperation, assuming that the cooperation cost and benefit are the same for all players. In our study, by contrast, each group occupies one position in the line and has a large number of players. A player imitates the strategy of others in the same group. A player chosen randomly from an upstream group never plays the game with another player in the same group but does with a player chosen randomly from the downstream group. The players in different roles have different cooperation costs and benefits. Therefore, from the viewpoint of mathematical modelling, our study proposes a model which the previous studies about the evolution of cooperation in the network structure have not assumed.
We have only compared our study with the previous ones which assumed that each player is located at each node of the network. There are studies about the evolution of cooperation assuming that each group is located at each node of the two-dimensional lattice, and players move to their neighbouring groups (e.g. [55]). These studies did not assume a player in a group plays the game with another in a different group.
In our model, players play the game sequentially, but our game is different from the sequential game studied previously in economics where the information of the previous player is available to the later player (e.g. [56,57]). In 'the two-person sequential game', for example, a player chooses the strategy after the opponent chooses the strategy, and upon that information changes his or her strategy [56,57]. After two players choose their strategy, they can receive the payoff. Therefore, our study gives a new perspective that the division of labour can be studied from the viewpoint of 'the evolution of cooperation'. Moreover, we can propose a model which the previous studies had not assumed and analysed from the viewpoint of the mathematical models in 'the evolution of cooperation'.

Baseline system
We present the baseline system (see figure 1), where there are n roles (n ∈ N and n ≥ 2). Here N is the set of natural numbers. For n = 1, there is no linear division of labour. There are n groups in the whole population. Each group is allocated to one role, and the group size is infinite. Each group consists of cooperators and defectors. We define cooperators as players who only cooperate, and defectors as players who only defect. We do not assume a mixed strategy in which players choose either cooperation or defection by probability. The frequency of cooperators in group i is i c and the frequency of defectors is i d . Here, i c + i d = 1. It is assumed that one player chosen randomly from the group i interacts with a player chosen randomly from the group i + 1 (1 ≤ i < n).
royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 10: 220856 We consider that the final product or service is produced through the division of labour; if players in all roles cooperate together to produce the product or service, the final product or service can be completed (figure 1a). A cooperator in an upstream group produces and modifies the product or service, paying a cost of cooperation, and gives it to a player in a downstream group. Let x i be defined as a cost of cooperation by a cooperator in the group i. The value of the product or service is regarded as the benefit to the player in the group i + 1, b i+1 . In this case, the net benefit of cooperators in the group i is b i − x i (table 1).
For the player in the first group, the benefit comes from the source. The net benefit of the cooperator in group 1 is b 1 − x 1 . In the nth group, a cooperator pays a cost, x n , to produce the final product, and the payoff of the cooperator is b n − x n , where the player sells the final product and gets benefit b n .
If a defector is chosen randomly from group i, the defector receives the benefit b i and s/he does not cooperate with a player in group i + 1 (figure 1b). As a result, the division of labour fails and the whole system is broken down. All players in all the groups then get the same negative effect g in their payoff; g can be interpreted as a bad reputation because the incomplete work gives a bad reputation to all members in the division of labour. The payoff of the defector is b i − g (table 1). After the defector is chosen from group i, the players in the later groups will not choose either cooperation or defection, and there can only be one acting defector in the chain of the linear division of labour. Therefore, the net payoff of players after defection occurs is −g (table 1). Table 1 indicates that, when g < x i , the game can be the PD game. Otherwise, mutual cooperation is always the best. In other words, when g > x i for all i, the cooperators' payoff is higher than the defectors'. Therefore, we assume that g < x i , which means that the baseline system has a dilemma situation.
Figure 1 (caption partially recovered). Legend: a cooperator, a defector; groups: 1st group, 2nd group, 3rd group, …, nth group. Panel (b) shows the division of labour when cooperators are chosen before the ith group and then a defector is chosen in the ith group. Once a defector is chosen, the whole system is broken. As a result, all players suffer −g.
Table 1. The payoff matrix in the baseline system for a player in the ith group. Cases: all being cooperators except the ith; a defector before the ith; a defector after the ith.
For calculating the payoff matrix of a player in group i, we need to consider three cases. First, the probability that all of the players in the rest of the groups are cooperators in the chain of the division of labour is defined as c i , which is the product of the frequencies of the cooperators of the groups except i. Therefore, $c_i = \prod_{j=1}^{i-1} j_c \prod_{j=i+1}^{n} j_c$, where an empty product is taken as 1. Second, there are defectors in the groups before the group i, with probability d ib . Thus, it is $d_{ib} = 1 - c_{ib}$, where $c_{ib} = \prod_{j=1}^{i-1} j_c$ is the probability that all the players chosen from the previous groups are cooperators. Third, the probability that there are defectors in the groups after the group i is d ia , where no defectors are in the previous groups. Thus, $d_{ia} = c_{ib}(1 - c_{ia})$, where $c_{ia} = \prod_{j=i+1}^{n} j_c$ is the probability of not having defectors after i. Here, $c_i = c_{ib}c_{ia}$, so that $c_i + d_{ib} + d_{ia} = 1$.
After each player randomly chosen from each group interacts with a player randomly chosen from the next downstream group, the expected payoff of each player in each group can be calculated. Then, within each group, players decide to imitate the strategy of others, in proportion to the expected payoff relative to the total payoff in the group. Here the random change of the strategy, or mutation, does not occur. Therefore, this interaction can be described by the replicator dynamics of the asymmetric game without mutation [58].
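The three case probabilities can be computed directly from the cooperator frequencies. The following is a minimal sketch, assuming the reading c i = c ib c ia, d ib = 1 − c ib and d ia = c ib (1 − c ia); the function name `chain_probabilities` and the example frequencies are hypothetical and only serve as illustration.

```python
from math import prod

def chain_probabilities(freqs, i):
    """Return (c_i, d_ib, d_ia) for group i (1-indexed).

    freqs[k] is the cooperator frequency in group k + 1.
    Assumed reading of the text: c_i = c_ib * c_ia,
    d_ib = 1 - c_ib, d_ia = c_ib * (1 - c_ia).
    """
    c_ib = prod(freqs[:i - 1])   # all partners chosen upstream cooperate
    c_ia = prod(freqs[i:])       # all partners chosen downstream cooperate
    c_i = c_ib * c_ia            # everyone except group i cooperates
    d_ib = 1.0 - c_ib            # a defector appears before group i
    d_ia = c_ib * (1.0 - c_ia)   # no defector before, a defector after
    return c_i, d_ib, d_ia

# Hypothetical frequencies for a chain of four groups.
freqs = [0.9, 0.5, 0.8, 0.6]
c_2, d_2b, d_2a = chain_probabilities(freqs, 2)
total = c_2 + d_2b + d_2a        # the three cases are exhaustive
```

For group 1 the upstream product is empty, so `math.prod` returns 1 and d 1b = 0, which matches the fact that no group precedes the first one.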
The replicator equation of a cooperator in the group 1 in the baseline system is as follows:
$$\frac{\mathrm{d}\,1_c}{\mathrm{d}t} = 1_c(1 - 1_c)(P_{1c} - P_{1d}) = 1_c(1 - 1_c)\{c_1(g - x_1) - d_{1a}x_1\}, \qquad (2.1)$$
where the average payoff of the cooperators in the group 1 is $P_{1c} = c_1(b_1 - x_1) + d_{1a}(b_1 - x_1 - g)$ and the average payoff of the defectors in the group 1 is $P_{1d} = c_1(b_1 - g) + d_{1a}(b_1 - g)$.
The replicator equation for cooperators in the group i (when 1 < i < n) is as follows:
$$\frac{\mathrm{d}\,i_c}{\mathrm{d}t} = i_c(1 - i_c)(P_{ic} - P_{id}) = i_c(1 - i_c)\{c_i(g - x_i) - d_{ia}x_i\}, \qquad (2.2)$$
where the average payoff of the cooperators in the group i is $P_{ic} = c_i(b_i - x_i) + d_{ib}(-g) + d_{ia}(b_i - x_i - g)$ and the average payoff of the defectors in the group i is $P_{id} = c_i(b_i - g) + d_{ib}(-g) + d_{ia}(b_i - g)$.
The replicator equation for the cooperators in the group n is as follows:
$$\frac{\mathrm{d}\,n_c}{\mathrm{d}t} = n_c(1 - n_c)(P_{nc} - P_{nd}) = n_c(1 - n_c)\{c_n(g - x_n)\}, \qquad (2.3)$$
where the average payoff of the cooperators in the group n is $P_{nc} = c_n(b_n - x_n) + d_{nb}(-g)$ and the average payoff of the defectors in the group n is $P_{nd} = c_n(b_n - g) + d_{nb}(-g)$.
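The baseline dynamics can be explored numerically. The sketch below is not the authors' simulation code: it assumes that the payoff difference in every group is c i (g − x i ) − d ia x i, uses a plain forward Euler step, and takes hypothetical parameter values with g < x i.

```python
from math import prod

def baseline_rhs(freqs, x, g):
    """Right-hand sides of the baseline replicator equations (assumed form)."""
    rhs = []
    for i in range(len(freqs)):
        c_ib = prod(freqs[:i])          # no defector upstream of group i + 1
        c_ia = prod(freqs[i + 1:])      # no defector downstream of group i + 1
        c_i = c_ib * c_ia
        d_ia = c_ib * (1.0 - c_ia)
        diff = c_i * (g - x[i]) - d_ia * x[i]   # P_ic - P_id
        rhs.append(freqs[i] * (1.0 - freqs[i]) * diff)
    return rhs

def integrate(freqs, x, g, dt=0.01, steps=5000):
    """Forward Euler integration; frequencies are clipped to [0, 1]."""
    for _ in range(steps):
        rhs = baseline_rhs(freqs, x, g)
        freqs = [min(1.0, max(0.0, f + dt * r)) for f, r in zip(freqs, rhs)]
    return freqs

# Hypothetical parameters satisfying the dilemma assumption g < x_i.
final = integrate([0.9, 0.9, 0.9], x=[5.0, 4.0, 3.0], g=2.0)
```

With these values the cooperator frequency in group 1 decays towards zero even from a high initial frequency, while the downstream frequencies stop changing once group 1 defects, in line with the neutrality of the later groups in the first group defection equilibrium.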

The defector sanction system
Next we focus on two sanction systems, namely the defector sanction system and the first role sanction system. In the defector sanction system, the defector in the chain of the linear division of labour gets punished with a fine f (f > 0), and the probability of finding the defector is ρ. In some types of linear division of labour, where monitoring a defector is too hard, ρ is very low compared with other parameters. In the defector sanction system, the payoffs are the same as in the baseline except for the sanction: the expected fine ρf is subtracted from the acting defector's payoff. The payoff matrix for a player in group i in the defector sanction system is in table 2.
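The replicator equation for the cooperators in the group 1 is as follows:
$$\frac{\mathrm{d}\,1_c}{\mathrm{d}t} = 1_c(1 - 1_c)(P_{1c} - P_{1d}) = 1_c(1 - 1_c)(c_1 g + \rho f - x_1),$$
where the average payoff of the cooperators in the group 1 is $P_{1c} = b_1 - x_1 - d_{1a}g$ and the average payoff of the defectors in the group 1 is $P_{1d} = b_1 - g - \rho f$. Here we use $c_1 + d_{1a} = 1$, which holds because no group precedes the group 1.
In the defector sanction system, the replicator equation for the cooperators in the group i (when 1 < i < n) is as follows:
$$\frac{\mathrm{d}\,i_c}{\mathrm{d}t} = i_c(1 - i_c)(P_{ic} - P_{id}) = i_c(1 - i_c)\{c_i(g + \rho f - x_i) + d_{ia}(\rho f - x_i)\},$$
where the average payoff of the cooperators in the group i is $P_{ic} = c_i(b_i - x_i) + d_{ib}(-g) + d_{ia}(b_i - x_i - g)$ and the average payoff of the defectors in the group i is $P_{id} = c_i(b_i - g - \rho f) + d_{ib}(-g) + d_{ia}(b_i - g - \rho f)$.
In the defector sanction system, the replicator equation for the cooperators in the group n is as follows:
$$\frac{\mathrm{d}\,n_c}{\mathrm{d}t} = n_c(1 - n_c)(P_{nc} - P_{nd}) = n_c(1 - n_c)\{c_n(g + \rho f - x_n)\},$$
where the average payoff of the cooperators in the group n is $P_{nc} = c_n(b_n - x_n) + d_{nb}(-g)$ and the average payoff of the defectors in the group n is $P_{nd} = c_n(b_n - g - \rho f) + d_{nb}(-g)$.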
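The effect of the expected fine ρf on the payoff difference between cooperators and defectors can be checked numerically. This is a minimal sketch, assuming the difference P ic − P id = c i (g + ρf − x i ) + d ia (ρf − x i ); the function name and all parameter values are hypothetical.

```python
from math import prod

def defector_sanction_diff(freqs, x, g, rho_f, i):
    """P_ic - P_id for group i (1-indexed) in the defector sanction system.

    Assumed form: c_i * (g + rho_f - x_i) + d_ia * (rho_f - x_i).
    """
    c_ib = prod(freqs[:i - 1])
    c_ia = prod(freqs[i:])
    c_i = c_ib * c_ia
    d_ia = c_ib * (1.0 - c_ia)
    return c_i * (g + rho_f - x[i - 1]) + d_ia * (rho_f - x[i - 1])

# Hypothetical costs and loss with g < x_i, evaluated when everyone cooperates.
x = [5.0, 4.0, 3.0]
g = 2.0
all_coop = [1.0, 1.0, 1.0]
weak = [defector_sanction_diff(all_coop, x, g, 1.0, i) for i in (1, 2, 3)]
strong = [defector_sanction_diff(all_coop, x, g, 4.0, i) for i in (1, 2, 3)]
```

When all other players cooperate, the difference reduces to g + ρf − x i : with the weak expected fine it is not positive in any group, and with the strong one it is positive in every group.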

The first role sanction system
In the first role sanction system, the fine f is the same as in the defector sanction system, and a defection in the chain of the linear division of labour is found with probability 1: the final product or service does not appear if defection occurs, so players know that defection has occurred without monitoring a defector. The player in the first role always gets punished for the defection, no matter which role defected. For example, this sanction system is executed to prevent illegal dumping in Japan [53]. In the first role sanction system, the payoff matrix is the same as the baseline except for group 1. If anyone defects, the first role gets punished, and the sanction f appears in the first group's payoff matrix. The payoff matrix for a player in the group 1 is in table 3, and that for a player in the group i (2 ≤ i ≤ n) is in table 4.
The replicator equation for a cooperator in the group 1 is as follows:
$$\frac{\mathrm{d}\,1_c}{\mathrm{d}t} = 1_c(1 - 1_c)(P_{1c} - P_{1d}) = 1_c(1 - 1_c)\{c_1(g + f) - x_1\},$$
where the average payoff of the cooperators in the group 1 is $P_{1c} = b_1 - x_1 + d_{1a}(-f - g)$ and the average payoff of the defectors in the group 1 is $P_{1d} = b_1 - g - f$.
Table 3. The payoff matrix in the first role sanction system for group 1. Cases: all being cooperators except the first; a defector after the first.
Table 4. The payoff matrix in the first role sanction system for a player in the group 2 ≤ i ≤ n. Cases: all being cooperators except the ith; a defector before the ith; a defector after the ith.
The replicator equations for the cooperators in the groups 1 < i < n and in the group n are the same as in the baseline model. The replicator equation for the cooperators in the group i (when 1 < i < n) is as follows:
$$\frac{\mathrm{d}\,i_c}{\mathrm{d}t} = i_c(1 - i_c)(P_{ic} - P_{id}) = i_c(1 - i_c)\{c_i(g - x_i) - d_{ia}x_i\},$$
where the average payoff of the cooperators in the group i is $P_{ic} = c_i(b_i - x_i) + d_{ib}(-g) + d_{ia}(b_i - x_i - g)$ and the average payoff of the defectors in the group i is $P_{id} = c_i(b_i - g) + d_{ib}(-g) + d_{ia}(b_i - g)$.
The replicator equation for the cooperators in the group n is as follows:
$$\frac{\mathrm{d}\,n_c}{\mathrm{d}t} = n_c(1 - n_c)(P_{nc} - P_{nd}) = n_c(1 - n_c)\{c_n(g - x_n)\},$$
where the average payoff of the cooperators in the group n is $P_{nc} = c_n(b_n - x_n) + d_{nb}(-g)$ and the average payoff of the defectors in the group n is $P_{nd} = c_n(b_n - g) + d_{nb}(-g)$. Table 5 shows the parameter list in our model.
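The first role sanction system changes only the payoff difference in group 1, which can be illustrated as follows. This sketch assumes the difference c 1 (g + f ) − x 1 for group 1 and the baseline form c i (g − x i ) − d ia x i for the other groups; the function name and the parameter values are hypothetical.

```python
from math import prod

def first_role_diff(freqs, x, g, f, i):
    """P_ic - P_id for group i (1-indexed) in the first role sanction system."""
    c_ib = prod(freqs[:i - 1])
    c_ia = prod(freqs[i:])
    c_i = c_ib * c_ia
    d_ia = c_ib * (1.0 - c_ia)
    if i == 1:
        # Only the first role is fined, so f enters only this difference.
        return c_i * (g + f) - x[0]
    # Later groups face the same difference as in the baseline system.
    return c_i * (g - x[i - 1]) - d_ia * x[i - 1]

# Hypothetical parameters: costs decrease downstream, fine f = 4, loss g = 3.
x = [5.0, 2.0, 1.0]
high = first_role_diff([0.5, 0.9, 0.9], x, g=3.0, f=4.0, i=1)
low = first_role_diff([0.5, 0.5, 0.5], x, g=3.0, f=4.0, i=1)
```

Because c 1 is the product of the downstream cooperator frequencies, the sign of the group 1 difference depends on how common cooperators already are downstream: here it is positive for the high frequencies and negative for the low ones.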

The summary of results in all three systems
We find four sorts of equilibrium in all three systems for n ≥ 3 (see appendix A). One is the all cooperation equilibrium which we represent as [1 c , 2 c , …, n c ] = [1, 1, …, 1]. The second one is the first group defection equilibrium which we represent as [0, * , …, * ] where ' * ' is any value between 0 and 1. In this equilibrium, i c is neutral (i ≥ 2) because the game stops after the player in the group 1 chooses defection and the players in the later roles get the same payoff regardless of their behaviour. As a result, i c neutrally converges to any value between 0 and 1 (i > 1). The third one is the cooperation-defection mixed equilibrium which is represented as [1, …, 1, 0, * , …, * ], where the group j (1 < j < n) is the first group of defectors. The fourth one is [1, …, 1, 0], which is named the last group defection equilibrium. Appendix A proves that this system only has four types of equilibrium points. We analyse the local stability of these four equilibria with the Jacobian matrix (see appendix B). Table 6 presents the summary of the analyses. Table 6 shows that the first group defection equilibrium is always stable in the baseline. It is also stable in the first role sanction system when c 1 (g + f ) < x 1 . In the defector sanction system, however, it is stable only if ρf < x 1 − c 1 g.
The cooperation-defection mixed equilibrium is unstable in both the baseline and the first role sanction system. It is stable in the defector sanction system when ρf < x j − c j g and $\max\{x_i\}_{i=1}^{j-1} < \rho f$, where a player in the group j defects and everyone cooperates in the groups before the group j.
The last group defection equilibrium is unstable in the baseline and the first role sanction system. It is stable in the defector sanction system when $\max\{x_i\}_{i=1}^{n-1} < \rho f$ and x n > g + ρf.
The all cooperation equilibrium is locally stable if $g > \max\{x_i\}_{i=1}^{n}$ in the baseline system. This means that players in all groups become cooperators in equilibrium when g, the loss caused by defectors, is higher than the cost of cooperation in every group. However, we consider the PD situation in the baseline system, and then we can assume that g < x i . This indicates that the all cooperation equilibrium is unstable in the baseline. It is stable in the defector sanction system when $\rho f + g > \max\{x_i\}_{i=1}^{n}$, which can hold even though g < x i . Therefore, if ρf is large enough, the defector sanction system can promote the evolution of cooperation.
The all cooperation equilibrium is stable in the first role sanction system when f + g > x 1 and $g > \max\{x_i\}_{i=2}^{n}$. This indicates that the first role sanction system promotes the evolution of cooperation more than the baseline system. If we consider the assumption, g < x i , which satisfies the condition of the PD game, the all cooperation equilibrium is considered stable if the condition that x 1 − f < g < x 1 is possible in the first role sanction system. Appendix B and table 6 suggest that the benefit given by the ith group player to the (i + 1)th group player, b i+1 , is cancelled out and does not influence the local stability of each equilibrium point.
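The local stability conditions of the all cooperation equilibrium in the three systems can be compared side by side. The sketch below simply encodes the three inequalities quoted above; the function name, the `system` labels and the parameter values are hypothetical, and the example relaxes g < x i for the downstream groups so that the first role condition can be met.

```python
def all_cooperation_stable(x, g, f=0.0, rho=0.0, system="baseline"):
    """Check the local stability condition of the all cooperation equilibrium.

    x is the list of cooperation costs x_1, ..., x_n (n >= 2).
    """
    if system == "baseline":
        return g > max(x)
    if system == "defector":
        # The expected fine rho * f is added to the loss g.
        return rho * f + g > max(x)
    if system == "first_role":
        # The fine f only helps against the cost of the first group.
        return f + g > x[0] and g > max(x[1:])
    raise ValueError(f"unknown system: {system}")

# Hypothetical parameters: the same fine f, but a low finding probability rho.
x = [5.0, 2.0, 1.0]
baseline = all_cooperation_stable(x, g=3.0)
defector = all_cooperation_stable(x, g=3.0, f=4.0, rho=0.1, system="defector")
first_role = all_cooperation_stable(x, g=3.0, f=4.0, system="first_role")
```

In this example only the first role sanction system satisfies its condition, because the low finding probability shrinks the expected fine in the defector sanction system to ρf = 0.4.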
To understand the dynamics well, we will discuss three special cases as well as other cases about the cost of cooperation in the following sections.

The cost of cooperation is the same for all the groups
The simplest assumption is that x i is the same for all 1 ≤ i ≤ n; x 1 = · · · = x n = x. After exploring the local stability conditions for each equilibrium in each of the three systems, we summarize the results in table 7, which shows that the first group defection equilibrium is a stable equilibrium in the baseline, and is also stable in the first role sanction system when c 1 (g + f ) < x. In the defector sanction system, it is locally stable when ρf < x − c 1 g.
The cooperation-defection mixed equilibrium and the last group defection equilibrium are unstable in all the systems. This indicates that once cooperation in group 1 starts, the all cooperation equilibrium will be stable. It is intuitive because all players face the same cost −g once a player chooses defection and the cost of cooperation x is the same in all the groups. Therefore, the players in the later groups will follow the cooperators in the first group. It is meaningful to punish a defector in the earliest group to promote cooperation. Therefore, the first role sanction system works.
Our analysis suggests that the all cooperation equilibrium is stable in the baseline system and in the first role sanction system only when g > x. As we assume that g < x, which meets the PD game, the equilibrium is not stable. It is stable when ρf + g > x for the defector sanction system.

The cost of cooperation is lower in higher i
Here, we consider the special case where the cost in a downstream group decreases in the linear division of labour, x 1 > x 2 > · · · > x n . After exploring the local stability conditions for each equilibrium in each of the three systems, we summarize the results in table 8, which shows that the first group defection equilibrium is a stable equilibrium in the baseline, and is also stable in the first role sanction system when c 1 (g + f ) < x 1 . In the defector sanction system, it is locally stable when ρf < x 1 − c 1 g.
The cooperation-defection mixed equilibrium and the last group defection equilibrium are unstable in all the systems.
Our analysis suggests that the all cooperation equilibrium is stable in the baseline system only when g > x 1 because x 1 > x 2 > · · · > x n . When the condition of the PD game is applied, the equilibrium is not stable. It is stable when ρf + g > x 1 for the defector sanction system (figure 2a), and is stable when f + g > x 1 and g > x 2 for the first role sanction system (figure 2b). Figure 2a,b shows that the same sanction f as in the first role sanction system cannot create the evolution of cooperation in the defector sanction system, because of the low finding probability of the defector.
A comparison between tables 7 and 8 shows that the outcomes of this case (table 8) are the same as those in table 7, which is the result when the cooperation cost is the same for all the groups. The numerical analyses indicate that the parameter c 1 becomes almost zero so that this parameter does not influence the simulation outcomes in the first group defection equilibrium (figure 2).
Surprisingly enough, figures 2a and 3 show that the defector sanction system promotes cooperation when ρf > x 1 , even though cooperators are rare in the beginning. In the region where ρf + g > x 1 and ρf < x 1 , the system is bistable: the sanction needs to be higher to create all cooperation when i c (0) is lower, and even a low sanction can create all cooperation when i c (0) is higher. Figure 2a also shows that the dynamics is independent of the initial condition and goes to the first group defection equilibrium in the region of ρf + g < x 1 . Therefore, if the probability of finding and catching a defector is too low and ρf is very low, the defector sanction system never promotes cooperation. If the sanction, ρf, is large enough to be effective, the defector sanction system promotes cooperation even though the initial frequency of cooperators is low. Figure 2b shows clearly that the first role sanction system creates cooperation only when the initial frequency of cooperators in all the groups is very high. When i c (0) is near 0.95, cooperation evolves only with a very high sanction f. When i c (0) is low, the system goes to first group defection.

The cost of cooperation is higher in higher i
Now we consider the model when the cost of cooperation rises in the downstream of the linear chain. Here we assume that x i > x i−1 for all i ≥ 2. The local stability conditions for each of the four equilibria are shown in table 9.
The main difference between the condition where the cost of cooperation increases in downstream groups and the condition where the cost of cooperation decreases in downstream groups is that the defector sanction system makes the cooperation-defection mixed equilibrium as well as the last group defection equilibrium locally stable when the cost of cooperation increases in downstream groups (table 9). Figure 4a shows the dynamics in the defector sanction system when x n−1 < x n − g, where all four of the equilibria appear. In the region of ρf < x 1 , the dynamics goes to the first group defection equilibrium, even with a very high initial frequency of cooperators in all groups. When x j−1 < ρf < x j , and j is neither the first nor the terminal group, all players cooperate up to the group j − 1 and all players in the group j defect; the cooperation-defection mixed equilibrium is locally stable, as shown in figure 5. When x n−1 < ρf and x n > g + ρf, the last group defection equilibrium is locally stable. When ρf + g > x n , all players in all groups go to cooperation even though their initial frequencies of cooperators are low.
When x n−1 > x n − g and x n − g > x 1 , figure 4b shows the bistable region where both the cooperation-defection mixed equilibrium and the all cooperation equilibrium are locally stable when x n − g < ρf < x j < x n−1 , where j is the first defector (1 < j < n − 1). However, the last group defection equilibrium is not locally stable, because the condition that x n−1 < ρf and x n > ρf + g does not hold. Figure 4c shows the outcomes when x n − g < ρf < x 1 ; we find the bistability of the all cooperation and the first group defection equilibrium. When x n − g < x 1 < x j−1 < ρf < x j < x n−1 , there exists the bistability of the cooperation-defection mixed equilibrium and the all cooperation equilibrium. When x n − g < x n−1 < ρf, there only exists the all cooperation equilibrium.
The last group defection equilibrium is not locally stable. Figure 4d shows that the first role sanction system can never create all cooperation, even with a very high punishment f and across all the initial cooperation frequencies.
In sum, the defector sanction system works as a sanction and promotes the evolution of cooperation when x 1 < x 2 < · · · < x n and ρf is large enough. The first role sanction system does not work and it is equivalent to the baseline system. The numerical analyses indicate that c j becomes almost zero so that this parameter does not influence the simulation outcomes when j is the first defector group (figure 4).
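The regions described above for the defector sanction system with increasing costs can be tabulated with a short helper. This is a sketch under two assumptions: the costs satisfy x 1 < x 2 < · · · < x n, and c j is approximated by zero, as the numerical analyses suggest. The function name and the cost vectors are hypothetical.

```python
def stable_equilibria(x, g, rho_f):
    """List the locally stable equilibria in the defector sanction system.

    Assumes increasing costs x and approximates c_j by 0.
    """
    n = len(x)
    stable = []
    if rho_f < x[0]:
        stable.append("first group defection")
    for j in range(2, n):  # j is the first defecting group, 1 < j < n
        if max(x[:j - 1]) < rho_f < x[j - 1]:
            stable.append(f"mixed, first defector group {j}")
    if max(x[:n - 1]) < rho_f and x[n - 1] > g + rho_f:
        stable.append("last group defection")
    if rho_f + g > max(x):
        stable.append("all cooperation")
    return stable

# Hypothetical costs with x_{n-1} < x_n - g: the four equilibria appear in turn.
wide = [5.0, 10.0, 20.0, 50.0]
regions = [stable_equilibria(wide, 3.0, r) for r in (4.0, 7.0, 30.0, 48.0)]

# Hypothetical costs with x_{n-1} > x_n - g > x_1: a bistable region exists.
narrow = [5.0, 10.0, 20.0, 22.0]
bistable = stable_equilibria(narrow, 3.0, 19.5)
```

As ρf increases in the first example, the prediction moves from first group defection through the mixed and last group defection equilibria to all cooperation; in the second example the mixed and all cooperation equilibria are predicted to be stable at the same ρf.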

Other cases
We consider the case where the costs of cooperation are drawn uniformly at random, run numerical simulations for a parameter set, and then see if tables 6-9 can predict the dynamics. Figure 6a shows that the dynamics approximately converges to [1 c , 2 c , 3 c , 4 c , 5 c , 6 c , 7 c , 8 c , 9 c , 10 c ] = [0, 0.18, 0, 0, 0, 1, 1, 0, 1, 0.5] when [x 1 , x 2 , …, x 10 ] = [14,20,17,20,13,6,5,19,4,9] in the defector sanction system when g = 3 and ρf = 9. This convergence point seems to be a cooperation-defection mixed equilibrium. However, as ρf < x 1 , and c 1 converges to zero in the simulation, table 6 predicts that it is the first group defection equilibrium. The simulation outcomes also show that, as 1 c converges to 0, it is the first group defection equilibrium. If we had only done the numerical analysis by computer simulations, we would have regarded this convergent point as a cooperation-defection mixed equilibrium. Therefore, the theoretical proofs help us understand the dynamics correctly.
When [x 1 , x 2 , …, x 10 ] is randomly assigned to [4,5,6,4,5,12,10,9,7,10], we observe that both the all cooperation equilibrium and the cooperation-defection mixed equilibrium are locally stable, and which one is reached depends on the initial condition. Here, n = 10, g = 3, ρf = 10. We set the initial condition for figure 6b to i c = 0.6 for all the groups, and the dynamics then converges to the all cooperation equilibrium. Table 6 also predicts that the all cooperation equilibrium is locally stable here, as $\max\{x_i\}_{i=1}^{10} = 12 < \rho f + g = 13$. If we set the initial condition as 6 c = 0.1 and i c = 0.6 (i ≠ 6), the dynamics converges to the cooperation-defection mixed equilibrium where players in the group 6 become defectors (figure 6c). This can be predicted by table 6, in which $\max\{x_i\}_{i=1}^{5} = 6 < \rho f = 10 < x_6 - c_6 g = 10.86$, where c 6 converges to 0.379 in the simulation (table 6).
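The two stability predictions for this cost vector can be re-checked with a few lines of arithmetic. The sketch uses only the values quoted in the text (the costs, g = 3, ρf = 10 and the simulated c 6 = 0.379); the variable names are hypothetical.

```python
# Costs, loss and expected fine quoted for figure 6b,c.
x = [4, 5, 6, 4, 5, 12, 10, 9, 7, 10]
g = 3.0
rho_f = 10.0
c_6 = 0.379  # value of c_6 reported from the simulation

# All cooperation: rho*f + g must exceed the largest cost.
largest_cost = max(x)
all_coop_stable = largest_cost < rho_f + g

# Mixed equilibrium with group 6 as the first defector:
# max{x_1..x_5} < rho*f < x_6 - c_6 * g.
j = 6
upstream_max = max(x[:j - 1])
threshold = x[j - 1] - c_6 * g
mixed_stable = upstream_max < rho_f < threshold
```

Both conditions hold at the same time, which is consistent with the bistability observed from the two initial conditions.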

Discussion and conclusion
We took a system of linear division of labour where there are n roles (n ≥ 2). If a defector defects in a role, the labour stops there, and the players associated with the later roles do not get a chance to play the roles. Each player in each group gets subjected to the same loss once a player defects. We analyse three systems: the baseline system and the two sanction systems, namely the defector sanction system and the first role sanction system, to see their effect on the evolution of cooperation. After applying the replicator equation of the asymmetric game, we find four equilibria: (i) where all the players in the first group are defectors, (ii) where all the players in all the groups are cooperators, (iii) where the players in all the groups before a group j are cooperators and the players in the group j, which is neither the first nor the terminal group, are defectors, which is called the cooperation-defection mixed equilibrium, and (iv) the last group defection equilibrium, where all the players in the last group are defectors, and all players in other groups are cooperators. Our findings are as follows: the benefit given by a cooperator in an upstream group to a player in a downstream group does not influence the evolutionary dynamics, but the cost of cooperation does.
Figure caption (fragment). […] outside each yellow dot shows that all players in the group j are the first defectors in the cooperation-defection mixed equilibrium; x j−1 < ρf < x j is the condition for the cooperation-defection mixed equilibrium to be stable, where j is the first defector. The green dots with the letter 'L' show when the dynamics evolves into the last group defection equilibrium. (A) in (a) means ρf < x 1 = 5; (B), x j−1 < ρf < x j − c j g; (C), x n−1 < ρf < x n − g; (D), x n − g = 47 < ρf. (E) in (b) means ρf < x 1 = 5; (F), x j−1 < ρf < x j − c j g; (G), x n − g < ρf < x n−1 − c n−1 g; (H), x n − g < x n−1 − c n−1 g < ρf. (I) in (c) means ρf < x n − g < x 1 − c 1 g; (J), […]
We compared two sanction systems, the defector sanction system and the first role sanction system, with the baseline system. Then, we found that the defector sanction system promotes the evolution of cooperation unless the probability of finding a defector is very low. However, when it is too hard to monitor and detect a defector, the defector sanction system does not work as a sanction any more. The first role sanction system promotes cooperation when the cost of cooperation decreases in downstream groups. Otherwise, the first role sanction system is equivalent to the baseline system; it does not work as a sanction. The other important point is that, in addition to the all cooperation and the first group defection equilibria, the cooperation-defection mixed equilibrium and the last group defection equilibrium can be locally stable when the cost of cooperation increases with higher i in the defector sanction system.
Even though our results can be applied to any n if n ≥ 2, the results in n = 2 are different from those in n ≥ 3. The crucial difference is that there are three equilibrium points in n = 2: [1 c , 2 c ] = [0, * ], [1,0], [1,1]. When the costs of the cooperation decrease downstream, the local stability condition in n = 2 is the same as in n ≥ 3. When the costs of the cooperation increase downstream in the defector sanction system, in n = 2, [1,0] is locally stable if x 1 < ρf and x 2 > ρf + g. [1,1] is locally stable if ρf + g > x 2 , and [0, * ] is locally stable if ρf < x 1 . There is a bistable region in n = 2; if ρf < x 1 < x 2 < ρf + g holds, [0, * ] and [1,1] are locally stable, while [1, 1, …, 1] and [0, * , …, * ] are bistable in n > 2, when x n − g < ρf < x 1 (figure 4c).
However, when n is larger, it becomes harder to have a bistable region in which [1, 1, …, 1] and [0, * , …, * ] are locally stable, because x n − g < ρf < x 1 < · · · < x n must hold under the assumption that g < x i .
Our study may be reminiscent of Boyd & Richerson [59]. In their model, players are in a unidirectional cycle network, neighbours play the PD game in order, and this goes on repeatedly. The two strategies are as follows: (i) upstream tit for tat (UTFT), in which the focal player cooperates/defects with the downstream player if the upstream player cooperated/defected with the focal player, and (ii) downstream tit for tat (DTFT), in which the focal player cooperates/defects with the downstream player if, in the previous cycle, the downstream player cooperated/defected with his own downstream player. Boyd & Richerson [59] found that DTFT evolved more than UTFT. Structurally, their study might look similar to ours. However, there are some critical differences between the two. One difference is the research purpose; Boyd & Richerson [59] investigated the evolution of indirect reciprocity. Another is the structure: the network of Boyd & Richerson [59] is a repeated cycle, and players observe the payoffs of all players in a cycle and imitate the strategy of a player with a higher payoff. In our study, UTFT is not possible because, if the player in the upstream group defects, the game stops there and the players in the focal group do not get to choose their strategy. DTFT is not possible because the strategy of a player is decided in advance and does not depend on the choice of the player in the downstream group. Nevertheless, Boyd & Richerson [59] give us a hint for developing the study of the division of labour. In our future work, we will incorporate some of their assumptions into our framework to modify and develop our study.
Figure. Time evolution with the costs of cooperation drawn uniformly at random in the defector sanction system. In (a), $n = 10$, $g = 3$, $\rho f = 9$, $[x_1, x_2, \ldots, x_{10}] = [14, 20, 17, 20, 13, 6, 5, 19, 4, 9]$. In (b,c), $n = 10$, $g = 3$, $\rho f = 10$, $[x_1, x_2, \ldots, x_{10}] = [4, 5, 6, 4, 5, 12, 10, 9, 7, 10]$. The initial condition is $i_c = 0.5$ for all groups in (a) and $i_c = 0.6$ for all groups in (b); in (c), $6_c = 0.1$ and $i_c = 0.6$ for the other groups.

We consider the special case $b_i = x_{i-1}$, which means that the benefit given by a cooperator in group $i - 1$ is the same as the cost of cooperation paid by that cooperator. For example, a player in the first group gives 10 000 yen to a player in the second group; 10 000 yen is both the cost to the player in group 1 and the benefit to the player in group 2. As $b_i - x_i = x_{i-1} - x_i$ should be positive, $x_i$ decreases as $i$ increases. Therefore, the result of the analyses corresponds to table 8. The total sum of the net benefit of all cooperators in all roles is $(x_0 - x_1) + (x_1 - x_2) + \cdots + (x_{n-1} - x_n) = x_0 - x_n$; therefore, this situation can be interpreted as one in which the player in each role decides how much to keep and how much to distribute to the downstream players (figure 7). For example, a cooperator in group 1 keeps $x_0 - x_1$ and gives $x_1$ to a player in group 2. If the player in group 2 is a cooperator, the cooperator keeps $x_1 - x_2$ and gives $x_2$ to a player in group 3. This continues until a player chooses defection. This situation has implications not only for the division of labour but also for government planning and spending or subcontracting, because the model resembles the flow of government spending or subcontracts (see figure 7).

For government planning and spending, cooperation in the division of labour is required [60]. Governments of all forms play a very significant role in running modern nation states. Their roles are highly elaborate, and a single individual cannot perform all of them. Therefore, all the related tasks are divided among many ministries, and the ministries in turn divide the tasks among their workers. For a certain piece of work to be done at the root level, the funds must pass through multiple agents and be planned by multiple agents. Therefore, cooperation across all the roles is key to success in that task. For example, if a governmental head wants to spend some money at the root level, he/she allocates the money to his/her subordinate, who in turn allocates a part of the money to his/her subordinate, and so on through many levels of subordinates until the money reaches its goal. Every cooperator in group $i$ gains the net benefit $x_{i-1} - x_i$ from cooperation (see figure 7). If someone stops or does not cooperate because of corruption or other reasons, the money does not reach the goal and the task fails. As a result, all players additionally suffer damage, $-g$. The defector, of course, gains the money that was given to him, but every agent, as a part of the government, acquires a bad reputation for the defection, which can be set as $-g$. To create cooperation among all the roles, we should make sure that the players in the first role engaging in government spending do not choose defection. The reason is as follows: our results in table 8 suggest that the all cooperation equilibrium or the first group defection equilibrium can be locally stable, but the cooperation-defection mixed equilibrium is not stable. This means that if a player in the first group is a cooperator, players in all other groups can be cooperators.
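The telescoping sum in the special case $b_i = x_{i-1}$ can be checked numerically. The sketch below assumes that every player in the chain cooperates; the function names and the amounts (other than the 10 000 yen handed from group 1 to group 2 in the example above) are hypothetical and chosen only for illustration.

```python
def net_benefits(x):
    """Net benefit x_{i-1} - x_i kept by the cooperator in each group i = 1..n.

    x is the list [x_0, x_1, ..., x_n]. It must be strictly decreasing,
    because b_i - x_i = x_{i-1} - x_i is required to be positive.
    """
    if any(x[i - 1] <= x[i] for i in range(1, len(x))):
        raise ValueError("x must be strictly decreasing")
    return [x[i - 1] - x[i] for i in range(1, len(x))]


def total_net_benefit(x):
    """Sum of the net benefits of all cooperators in all roles."""
    return sum(net_benefits(x))


# Hypothetical amounts in yen: x_0 = 12 000, x_1 = 10 000, x_2 = 6 000, x_3 = 1 000.
x = [12000, 10000, 6000, 1000]
print(net_benefits(x))                     # amount kept in groups 1, 2 and 3
print(total_net_benefit(x), x[0] - x[-1])  # both equal x_0 - x_n
```

The intermediate amounts cancel, so the total depends only on $x_0$ and $x_n$, however the money is split along the chain.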
We did not comment particularly on the impact of the net benefit $b_i - x_i$ or on the distribution of the benefit among the cooperators in the system, because the benefit does not influence the dynamics (see tables 6–9). Nowak [8] shows that cooperation can evolve in a network-structured population in which each node has $k$ links if $b/c > k$, where $b$ is the benefit from a cooperator and $c$ is the cost of cooperation. In our work, however, the benefit from cooperation cancels out, and so we cannot summarize our result using the benefit $b$. In this respect, our work sheds new light on the evolution of cooperation in networks.
This work focused on only one case of the division of labour; there are other types. Here we assumed that each player in each group obtains the payoff after playing the game with the player in the downstream group. In another type of division of labour, by contrast, each player obtains the benefit only after all tasks in the division of labour are completed. In our future work, we will investigate how the outcomes differ across various cases of the division of labour, using the replicator equations for asymmetric games. In addition, we will consider other types of sanction: for example, sanctions that mistakenly regard cooperators as defectors and punish them.